
    Graph Triangulations and the Compatibility of Unrooted Phylogenetic Trees

    We characterize the compatibility of a collection of unrooted phylogenetic trees as a question of determining whether a graph derived from these trees, the display graph, has a specific kind of triangulation, which we call legal. Our result is a counterpart to the well-known triangulation-based characterization of the compatibility of undirected multi-state characters.

    Improved Lower Bounds on the Compatibility of Multi-State Characters

    We study a long-standing conjecture on the necessary and sufficient conditions for the compatibility of multi-state characters: There exists a function $f(r)$ such that, for any set $C$ of $r$-state characters, $C$ is compatible if and only if every subset of $f(r)$ characters of $C$ is compatible. We show that for every $r \ge 2$, there exists an incompatible set $C$ of $\lfloor\frac{r}{2}\rfloor\cdot\lceil\frac{r}{2}\rceil + 1$ $r$-state characters such that every proper subset of $C$ is compatible. Thus, $f(r) \ge \lfloor\frac{r}{2}\rfloor\cdot\lceil\frac{r}{2}\rceil + 1$ for every $r \ge 2$. This improves the previous lower bound of $f(r) \ge r$ given by Meacham (1983), and generalizes the construction showing that $f(4) \ge 5$ given by Habib and To (2011). We prove our result via a result on quartet compatibility that may be of independent interest: For every integer $n \ge 4$, there exists an incompatible set $Q$ of $\lfloor\frac{n-2}{2}\rfloor\cdot\lceil\frac{n-2}{2}\rceil + 1$ quartets over $n$ labels such that every proper subset of $Q$ is compatible. We contrast this with a result on the compatibility of triplets: For every $n \ge 3$, if $R$ is an incompatible set of more than $n-1$ triplets over $n$ labels, then some proper subset of $R$ is incompatible. We show this upper bound is tight by exhibiting, for every $n \ge 3$, a set $R$ of $n-1$ triplets over $n$ taxa such that $R$ is incompatible, but every proper subset of $R$ is compatible.
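To make the size of the improvement concrete, the bounds quoted in this abstract can be evaluated directly. The following is a minimal sketch of the arithmetic only; the function names `character_lower_bound` and `quartet_set_size` are hypothetical and do not come from the paper.

```python
def character_lower_bound(r: int) -> int:
    """Lower bound on f(r) quoted in the abstract: floor(r/2) * ceil(r/2) + 1."""
    if r < 2:
        raise ValueError("the bound is stated for r >= 2")
    # r // 2 is floor(r/2); (r + 1) // 2 is ceil(r/2) for integer r.
    return (r // 2) * ((r + 1) // 2) + 1


def quartet_set_size(n: int) -> int:
    """Size of the incompatible quartet set over n labels quoted in the abstract."""
    if n < 4:
        raise ValueError("the quartet result is stated for n >= 4")
    # Same expression as above with r replaced by n - 2.
    return character_lower_bound(n - 2)


if __name__ == "__main__":
    # Compare the new bound with the earlier bound f(r) >= r.
    for r in range(2, 9):
        print(r, character_lower_bound(r))
```

For r = 4 the expression gives 5, matching the bound f(4) ≥ 5 cited from Habib and To (2011); for r ≥ 4 the new bound exceeds the earlier bound f(r) ≥ r and grows quadratically in r.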

    Inferring Species Trees from Incongruent Multi-Copy Gene Trees Using the Robinson-Foulds Distance

    We present a new method for inferring species trees from multi-copy gene trees. Our method is based on a generalization of the Robinson-Foulds (RF) distance to multi-labeled trees (mul-trees), i.e., gene trees in which multiple leaves can have the same label. Unlike most previous phylogenetic methods using gene trees, this method does not assume that gene tree incongruence is caused by a single, specific biological process, such as gene duplication and loss, deep coalescence, or lateral gene transfer. We prove that it is NP-hard to compute the RF distance between two mul-trees, but it is easy to calculate the generalized RF distance between a mul-tree and a singly-labeled tree. Motivated by this observation, we formulate the RF supertree problem for mul-trees (MulRF), which takes a collection of mul-trees and constructs a species tree that minimizes the total RF distance from the input mul-trees. We present a fast heuristic algorithm for the MulRF supertree problem. Simulation experiments demonstrate that the MulRF method produces more accurate species trees than gene tree parsimony methods when incongruence is caused by gene tree error, duplications and losses, and/or lateral gene transfer. Furthermore, the MulRF heuristic runs quickly on data sets containing hundreds of trees with up to a hundred taxa. Comment: 16 pages, 11 figures
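For orientation, the ordinary RF distance that this abstract generalizes can be sketched for two singly-labeled rooted trees as the size of the symmetric difference of their cluster sets. This is not the mul-tree generalization or the MulRF heuristic from the paper; the nested-tuple tree encoding and the names `clusters` and `rf_distance` are assumptions made for illustration.

```python
def clusters(tree):
    """Return the leaf sets below each internal node of a rooted tree.

    A tree is assumed to be either a leaf label (str) or a tuple of subtrees.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)  # record the cluster induced by this internal node
        return leaves

    walk(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Count the clusters present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


if __name__ == "__main__":
    t1 = ((("a", "b"), "c"), "d")
    t2 = ((("a", "c"), "b"), "d")
    print(rf_distance(t1, t2))
```

In this example the two trees differ only in the clusters {a, b} and {a, c}, so the sketch reports a distance of 2. Some conventions halve this count or use bipartitions of unrooted trees instead of clusters.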

    Extracting Conflict-free Information from Multi-labeled Trees

    A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious. We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximally reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation on MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both quality of the reduced tree and the degree of data reduction achieved. Comment: Submitted in Workshop on Algorithms in Bioinformatics 2012 (http://algo12.fri.uni-lj.si/?file=wabi

    A Simple Characterization of the Minimal Obstruction Sets for Three-State Perfect Phylogenies

    Lam, Gusfield, and Sridhar (2009) showed that a set of three-state characters has a perfect phylogeny if and only if every subset of three characters has a perfect phylogeny. They also gave a complete characterization of the sets of three three-state characters that do not have a perfect phylogeny. However, it is not clear from their characterization how to find a subset of three characters that does not have a perfect phylogeny without testing all triples of characters. In this note, we build upon their result by giving a simple characterization of when a set of three-state characters does not have a perfect phylogeny that can be inferred from testing all pairs of characters.